{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1755d1f0-f19f-4438-a249-f8c3d487543e",
   "metadata": {},
   "source": [
    "# LlamaIndex - Starter Tutorial (Local Models)\n",
    "\n",
    ":::{note}\n",
    "This tutorial is based on LlamaIndexofficial docs tutorial ([https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/](https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/#starter-tutorial-local-models))\n",
    ":::"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f597b254-c829-4e34-8480-f0aba098204d",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Download data and folder setup\n",
    "\n",
    "On the **Docker host side**, run the following to set up the `jetson-containers`' `/data` directory.\n",
    "\n",
    "```bash\n",
    "cd jetson-containers\n",
    "mkdir -p data/documents/L4T-README\n",
    "cp /media/jetson/L4T-README/*.txt data/documents/L4T-README/\n",
    "```\n",
    "\n",
    "This in turn creates the mounted volume `/data/documents/L4T-README` inside the container.<br> \n",
    "Your directory structure should look like this:\n",
    "\n",
    "```\n",
    "└── ./data/documents/L4T-README\n",
    "    ├── INDEX.txt\n",
    "    ├── README-usb-dev-mode.txt\n",
    "    ├── README-vnc.txt\n",
    "    └── README-wifi.txt\n",
    "```\n",
    "\n",
    "You can check this with running a following bash command in the following cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "56b07549-9383-456d-8ffb-216fa8bd3cf3",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/data/documents/L4T-README:\n",
      "total 24\n",
      "-rw-rw-r-- 1 1000 1000  1104 May  6 22:42 INDEX.txt\n",
      "-rw-rw-r-- 1 1000 1000 11126 May  6 22:42 README-usb-dev-mode.txt\n",
      "-rw-rw-r-- 1 1000 1000  3590 May  6 22:42 README-vnc.txt\n",
      "-rw-rw-r-- 1 1000 1000  1940 May  6 22:42 README-wifi.txt\n"
     ]
    }
   ],
   "source": [
    "!ls -Rl /data/documents/L4T-README"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e191cda6",
   "metadata": {},
   "source": [
    "## Download the Llama2 model\n",
    "\n",
    "Check if you have Llama2 model downloaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "0456c238-5450-422f-8936-b6a07d82fb3b",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NAME                    \tID          \tSIZE  \tMODIFIED     \n",
      "mxbai-embed-large:latest\t468836162de7\t669 MB\t10 hours ago\t\n",
      "llama3:70b              \tbe39eb53a197\t39 GB \t25 hours ago\t\n",
      "llama3:latest           \ta6990ed6be41\t4.7 GB\t25 hours ago\t\n",
      "nomic-embed-text:latest \t0a109f422b47\t274 MB\t32 hours ago\t\n",
      "llama2:latest           \t78e26419b446\t3.8 GB\t11 days ago \t\n",
      "mistral:latest          \t61e88e884507\t4.1 GB\t2 weeks ago \t\n",
      "llama2:70b              \te7f6c06ffef4\t38 GB \t2 weeks ago \t\n",
      "llama2:13b              \td475bf4c50bc\t7.4 GB\t2 weeks ago \t\n"
     ]
    }
   ],
   "source": [
    "!ollama list"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69b8093f-31a8-4687-aaf0-260f90b5f34a",
   "metadata": {},
   "source": [
    "If not, run the following cell to download the Llama2 model using `ollama` command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c627300c",
   "metadata": {},
   "outputs": [],
   "source": [
    "!ollama pull llama3"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b353751b-c598-4388-9680-bcabe7ffa9b2",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Load data and build an index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "2634dfa3-1263-42b9-901d-004f5166e38e",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings\n",
    "from llama_index.llms.ollama import Ollama\n",
    "\n",
    "import logging\n",
    "import sys\n",
    "# import llama_index.core\n",
    "\n",
    "# llama_index.core.set_global_handler(\"simple\")\n",
    "logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)\n",
    "logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "a7be6e88-dfde-432c-be4d-f843b34336dd",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEBUG:llama_index.core.readers.file.base:> [SimpleDirectoryReader] Total files added: 4\n",
      "> [SimpleDirectoryReader] Total files added: 4\n"
     ]
    }
   ],
   "source": [
    "reader = SimpleDirectoryReader(input_dir=\"/data/documents/L4T-README/\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "85d110b4-2a30-4f0c-af81-85a41d720d49",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEBUG:fsspec.local:open file: /data/documents/L4T-README/INDEX.txt\n",
      "open file: /data/documents/L4T-README/INDEX.txt\n",
      "DEBUG:fsspec.local:open file: /data/documents/L4T-README/README-usb-dev-mode.txt\n",
      "open file: /data/documents/L4T-README/README-usb-dev-mode.txt\n",
      "DEBUG:fsspec.local:open file: /data/documents/L4T-README/README-vnc.txt\n",
      "open file: /data/documents/L4T-README/README-vnc.txt\n",
      "DEBUG:fsspec.local:open file: /data/documents/L4T-README/README-wifi.txt\n",
      "open file: /data/documents/L4T-README/README-wifi.txt\n"
     ]
    }
   ],
   "source": [
    "documents = reader.load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7d67b299-fdb8-469f-950c-070220b48965",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(documents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "315b478d-c721-4863-83fc-ea57caa73d8b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index.embeddings.ollama import OllamaEmbedding\n",
    "\n",
    "Settings.embed_model = OllamaEmbedding(model_name=\"mxbai-embed-large\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "4cad9359-8c60-46a9-8acb-ca0ad1b79abc",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# ollama\n",
    "Settings.llm = Ollama(model=\"llama3\", request_timeout=300.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "b9dd0b9e-cf1d-48f3-8e4d-75f7fa7130c9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Enlarge the chunk size from the default 1024\n",
    "Settings.chunk_size = 800\n",
    "Settings.chunk_overlap = 50"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "ccd18df3-d52a-4002-8884-64c79401849b",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: 1==============================================...\n",
      "> Adding chunk: 1==============================================...\n",
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: 2==============================================...\n",
      "> Adding chunk: 2==============================================...\n",
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: Ethernet on Linux\n",
      "----------------------------...\n",
      "> Adding chunk: Ethernet on Linux\n",
      "----------------------------...\n",
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: You may log into the system and run shell comma...\n",
      "> Adding chunk: You may log into the system and run shell comma...\n",
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: 3==============================================...\n",
      "> Adding chunk: 3==============================================...\n",
      "DEBUG:llama_index.core.node_parser.node_utils:> Adding chunk: 4==============================================...\n",
      "> Adding chunk: 4==============================================...\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n"
     ]
    }
   ],
   "source": [
    "index = VectorStoreIndex.from_documents(documents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "5cd69846-fd2d-408b-bb61-4fbc28e85f55",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEBUG:fsspec.local:open file: /opt/llama-index/storage/docstore.json\n",
      "open file: /opt/llama-index/storage/docstore.json\n",
      "DEBUG:fsspec.local:open file: /opt/llama-index/storage/index_store.json\n",
      "open file: /opt/llama-index/storage/index_store.json\n",
      "DEBUG:fsspec.local:open file: /opt/llama-index/storage/graph_store.json\n",
      "open file: /opt/llama-index/storage/graph_store.json\n",
      "DEBUG:fsspec.local:open file: /opt/llama-index/storage/default__vector_store.json\n",
      "open file: /opt/llama-index/storage/default__vector_store.json\n",
      "DEBUG:fsspec.local:open file: /opt/llama-index/storage/image__vector_store.json\n",
      "open file: /opt/llama-index/storage/image__vector_store.json\n"
     ]
    }
   ],
   "source": [
    "index.storage_context.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "5212c1ac-8341-470d-b942-af4971dd9e87",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.0K\t./storage/index_store.json\n",
      "132K\t./storage/default__vector_store.json\n",
      "4.0K\t./storage/graph_store.json\n",
      "4.0K\t./storage/image__vector_store.json\n",
      "32K\t./storage/docstore.json\n",
      "180K\t./storage\n"
     ]
    }
   ],
   "source": [
    "!du -ah ./storage"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6934e64e-4fa2-4a12-af1e-7a582ed63fef",
   "metadata": {},
   "source": [
    "## Query your data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "6be1d7cd-7fe1-4a5f-a498-5bf486a38c58",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "3f23e03d-acf3-4dc3-9416-d6c57424e356",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost:11434\n",
      "Starting new HTTP connection (1): localhost:11434\n",
      "DEBUG:urllib3.connectionpool:http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "http://localhost:11434 \"POST /api/embeddings HTTP/1.1\" 200 None\n",
      "DEBUG:llama_index.core.indices.utils:> Top 2 nodes:\n",
      "> [Node 736f1eb2-a778-4e79-880f-a814e11c5d81] [Similarity score:             0.694787] 2======================================================================\n",
      "                        ...\n",
      "> [Node 029afa15-bbd3-4423-b1bc-3b78ea33c911] [Similarity score:             0.595595] Ethernet on Linux\n",
      "----------------------------------------------------------------------\n",
      "Two US...\n",
      "> Top 2 nodes:\n",
      "> [Node 736f1eb2-a778-4e79-880f-a814e11c5d81] [Similarity score:             0.694787] 2======================================================================\n",
      "                        ...\n",
      "> [Node 029afa15-bbd3-4423-b1bc-3b78ea33c911] [Similarity score:             0.595595] Ethernet on Linux\n",
      "----------------------------------------------------------------------\n",
      "Two US...\n",
      "DEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False\n",
      "load_ssl_context verify=True cert=None trust_env=True http2=False\n",
      "DEBUG:httpx:load_verify_locations cafile='/usr/local/lib/python3.10/dist-packages/certifi/cacert.pem'\n",
      "load_verify_locations cafile='/usr/local/lib/python3.10/dist-packages/certifi/cacert.pem'\n",
      "DEBUG:httpcore.connection:connect_tcp.started host='localhost' port=11434 local_address=None timeout=300.0 socket_options=None\n",
      "connect_tcp.started host='localhost' port=11434 local_address=None timeout=300.0 socket_options=None\n",
      "DEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0xfffe99dc7640>\n",
      "connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0xfffe99dc7640>\n",
      "DEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>\n",
      "send_request_headers.started request=<Request [b'POST']>\n",
      "DEBUG:httpcore.http11:send_request_headers.complete\n",
      "send_request_headers.complete\n",
      "DEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>\n",
      "send_request_body.started request=<Request [b'POST']>\n",
      "DEBUG:httpcore.http11:send_request_body.complete\n",
      "send_request_body.complete\n",
      "DEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>\n",
      "receive_response_headers.started request=<Request [b'POST']>\n",
      "DEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Type', b'application/json; charset=utf-8'), (b'Date', b'Wed, 15 May 2024 07:53:13 GMT'), (b'Content-Length', b'1163')])\n",
      "receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Content-Type', b'application/json; charset=utf-8'), (b'Date', b'Wed, 15 May 2024 07:53:13 GMT'), (b'Content-Length', b'1163')])\n",
      "INFO:httpx:HTTP Request: POST http://localhost:11434/api/chat \"HTTP/1.1 200 OK\"\n",
      "HTTP Request: POST http://localhost:11434/api/chat \"HTTP/1.1 200 OK\"\n",
      "DEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>\n",
      "receive_response_body.started request=<Request [b'POST']>\n",
      "DEBUG:httpcore.http11:receive_response_body.complete\n",
      "receive_response_body.complete\n",
      "DEBUG:httpcore.http11:response_closed.started\n",
      "response_closed.started\n",
      "DEBUG:httpcore.http11:response_closed.complete\n",
      "response_closed.complete\n",
      "DEBUG:httpcore.connection:close.started\n",
      "close.started\n",
      "DEBUG:httpcore.connection:close.complete\n",
      "close.complete\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"What IPv4 address Jetson device gets assigned when connected to a PC with a USB cable? \\\n",
    "    And what file to edit in order to change the IP address to be assigned to Jetson itself in USB device mode? \\\n",
    "    Plesae state which section you find the answer for each question.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "5fc566dd-e2c7-41d9-8f9e-3de55cfde2d7",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "<b>What IPv4 address Jetson device gets assigned when connected to a PC with a USB cable?\n",
       "\n",
       "According to the context information, under the \"Ethernet\" protocol, Linux for Tegra assigns a static IPv4 address of 192.168.55.1 to Jetson, and runs a DHCP server to automatically assign an IPv4 address of 192.168.55.100 to your host machine (Section: \"Linux for Tegra implements two types of Ethernet devices...\").\n",
       "\n",
       "So, the answer is: 192.168.55.1\n",
       "\n",
       "What file to edit in order to change the IP address to be assigned to Jetson itself in USB device mode?\n",
       "\n",
       "According to the context information, you can edit the file \"/opt/nvidia/l4t-usb-device-mode/nv-l4t-usb-device-mode-config.sh\" on Jetson to change the IPv4 network parameters (Section: \"Changing the IPv4 Address\").\n",
       "\n",
       "So, the answer is: /opt/nvidia/l4t-usb-device-mode/nv-l4t-usb-device-mode-config.sh</b>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.display import Markdown, display\n",
    "\n",
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ebc6b55-4739-4c38-990e-3c7ff87f743f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
